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Abstract —Coverage planning and optimization is one of the 
most crncial tasks for a radio network operator. Efficient cov¬ 
erage optimization requires accurate coverage estimation. This 
estimation relies on geo-located field measurements which are 
gathered today during highly expensive drive tests (DT); and will 
be reported in the near future by users’ mobile devices thanks to 
the 3GPP Minimizing Drive Tests (MDT) feature [1]. This feature 
consists in an automatic reporting of the radio measurements 
associated with the geographic location of the user’s mobile 
device. Such a solution is still costly in terms of battery consump¬ 
tion and signaling overhead. Therefore, predicting the coverage 
on a location where no measurements are available remains a 
key and challenging task. This paper describes a powerful tool 
that gives an accurate coverage prediction on the whole area of 
interest: it builds a coverage map by spatially interpolating geo- 
located measurements using the Kriging technique. The paper 
focuses on the reduction of the computational complexity of the 
Kriging algorithm by applying Fixed Rank Kriging (FRK). The 
performance evaluation of the FRK algorithm both on simulated 
measurements and real field measurements shows a good trade¬ 
off between prediction efficiency and computational complexity. 
In order to go a step further towards the operational application 
of the proposed algorithm, a multicellular use-case is studied. 
Simulation results show a good performance in terms of coverage 
prediction and detection of the best serving cell. 

Keywords—Wireless Network, Coverage Map, Radio Environ¬ 
ment Map, Spatial Statistics, Fixed Rank Kriging, Expeetation- 
Maximization algorithm. 

I. Introduction 

Coverage planning and optimization is one of the most 
crucial tasks for a radio network operator. Efficient coverage 
optimization requires accurate coverage estimation. This es¬ 
timation relies on geo-located field measurements, gathered 
today during highly expensive drive tests (DT) and will be 
reported in the near future by users’ mobile devices thanks 
to the 3GPP Minimization of Drive Tests (MDT) feature 
standardized since Release 9 [2]. The radio measurements 
together with the best possible geo-location will be then 
automatically reported to the network by the user’s mobile 
device. Thanks to the integration of Global Positioning System 
(GPS) in the new generation of users’ mobile devices, the geo¬ 
location information is quite accurate. Hence, with MDT, the 
network operator will soon have at his disposal a rich source 
of information that provides a greater insight into the end-user 
perceived quality of service and a better knowledge of the radio 
environment. 
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The collection and exploitation of location aware radio 
measurements was introduced much earlier in the literature in 
the context of the cognitive radio paradigm [3]. The radio En¬ 
vironmental Map (REM) concept was introduced by Zhao [4] 
as a database that stores geo-located radio environmental 
information mainly for opportunistic spectrum access. The 
REM concept was then extended to an entity that not only 
stores geo-located radio information but also post processes 
this information in order to build a complete map. The missing 
information, namely the considered radio metric in locations 
where no measurements are available, is then predicted by 
interpolating the geo-located measurements [5]-[7]. 

The REM was then studied in the framework of European 
Telecommunications Standards Institute (ETSI) as a tool for 
the exploitation of geo-located radio measurements for the 
radio resource management of mobile wireless networks. A 
technical report dedicated to the definition of use-cases for 
building and exploiting the REM gives the following defini¬ 
tion [8]: ’’The Radio Environment Map (REM) defines a set of 
network entities and associated protocols that trigger, perform, 
store and process geo-located radio measurements (received 
signal strength, interference levels, Quality of Service (QoS) 
measurements [...]) and network performance indicators. Such 
measurements are typically performed by user equipments, net¬ 
work entities or dedicated sensors.” In this ETSI report, several 
use-cases for REM exploitation in radio resource management 
are described such as coverage and capacity optimization, and 
interference management especially for the introduction of a 
new technology. 

Inspired by the geo-statistics area, Kriging technique was ap¬ 
plied to REM constmction, mainly for coverage prediction and 
analysis in radio mobile networks [9]-[ll]. Bayesian Kriging 
was first applied to 3G Received Signal Code Power (RSCP) 
coverage prediction in [9], then to Long Term Evolution (LTE) 
Reference Signal Received Power (RSRP) coverage analysis in 
[10]. The description of the bayesian Kriging methodology and 
the algorithm used in [9], [10], is detailed in [11]. These papers 
give promising results in terms of performance. However the 
computational complexity of the algorithm increases cubically 
with the number of measurement points ( 0{N^), where N 

is the number of measurement points). 

In this paper, we aim at providing a method for predict¬ 
ing LTE RSRP coverage map based on MDT data. Given 
the huge number of measurements that will be reported by 
mobile terminals with MDT in the near future, reducing the 
computational complexity of the REM construction becomes 
crucial. In [12], [13], we used the Fixed Rank Kriging (FRK) 
introduced by Cressie in [14] (also called in the literature 
Spatial Random Effects model), as a method to reduce the 
computational complexity of the Kriging technique applied 
to radio coverage prediction; the method was evaluated on 
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simulated data (see [12]) and on real field data (see [13]), 
both in the situation of a single cell with an omni-directional 
antenna. In this paper, we go a step further towards operational 
application of the REM prediction algorithm by considering 
a multicellular use-case: the directivity of the antennas is 
introduced in the model, and both the coverage prediction 
and the good detection of the best serving cell are part of 
the statistical analysis. 

The contribution of this paper can be summarized in the 
following: 

• We describe the FRK algorithm and its adaptation to 
radio coverage data. It requires an estimation step of 
the unknown parameters of the model: we show that the 
method of moments proposed in [14] can not apply and 
we derive a Maximum Likelihood alternative. 

• We extend our model to a multicellular use-case with 
directive antennas. 

• We evaluate the performances of the proposed algo¬ 
rithms both on simulated and real data. 

The paper is organized as follows: Section II starts with an 
overview of the propagation models existing in the literature. 
Then the statistical parametric model is introduced. The last 
part is devoted to the parameter estimation: the applicability of 
the original method is discussed, and an alternative is given. 
In Section III, the extension to the multicellular use-case is 
detailed. Then the numerical analysis in the single cell and 
multicellular use-cases are provided in Section IV. Finally, 
Section V summarizes the main conclusions. 

II. Radio Environment Map prediction models 

In this section, we give an overview of basic propagation 
models and give some notations that will be used in the 
remainder of this paper. Then we introduce a new model for 
REM constmction, which is adapted from the FRK model 
proposed in [14]. 

A. Introduction to propagation modeling and notations 

A radio propagation model describes a relation between the 
signal strength, and the locations of the transmitter and the 
receiver. There are in the literature two different approaches for 
this description which are respectively derived using analytical 
and empirical methods [15]. The analytical approach is based 
on fundamental principals of the radio propagation concept. 
The empirical one introduces a statistical model and uses a set 
of observations to fit this model. The advantage of the second 
approach is the use of actual field measurements to estimate 
the parameters of the model. 

Denote by Z[x) the received power at the receiver end 
located at a; G K^, expressed in dB. The path-loss model, 
also called in the literature the log-distance model, is among 
the analytical approaches. It describes Z{x) as a logarithmi¬ 
cally decreasing function of the distance dist(a:) between the 
transmitter location and the receiver location x (see e.g. [15]): 

Z(a:) = pt — 10Klnio(dist(a:)), a: G (1) 

Pt is the transmitted power in dB and k is the path loss 
exponent. When using this formula to predict the REM, 


Pt is considered as known since it is one of the antenna 
characteristic, and k depends on the propagation environment. 
For example, n is in the order of 2 in free space propagation 
and it is larger when considering an environment with obstacles 
(see e.g. [15], [16]). 

The model in Eq. (1) does not take into account the fact 
that two mobile Equipment (ME) equally distant from the base 
station (BS), may have different environment characteristics. 
To tackle this bottleneck, empirical approaches based on a sta¬ 
tistical modeling of the shadowing effect have been introduced. 
The log-normal shadowing model consists in setting (see [17]) 

Z{x) = Pt — 10K\nio{dist{x)) + cr,^D{x), a: G (2) 

where {i>{x))x, introduced to capture the shadowing effect, 
is a standard Gaussian variable (note that the terminology 
“log-normal” comes from the fact that the shadowing term 
expressed in dB is normally distributed), and ai, > 0. With 
this model, the REM prediction at location x is Z{x) = 
Pt — 10k lnio(dist(a;)). The unknown parameters pt and k 
are estimated from measured data, usually by the maximum 
likelihood estimator (which is also the least-square estimator 
in this Gaussian case). 

Both the models (1) and (2) are large-scale propagation 
models: they do not consider the small fluctuations of the 
received power due to the local environment. The correlated 
shadowing model captures these small-scale variations: 

Z{x) = Pt — 10 k\ nio{dist{x))-\-^{x), x G (3) 

where {u{x))x is a zero mean Gaussian process with a para¬ 
metric spatial covariance function {C{x,x'))x,x' ■ This model 
implies that two signals Z{x), Z{x') at different locations 
x,x' are correlated, with covariance equal to C{x,x'). The 
REM prediction formula based on the model (3) is known 
in the literature as the Kriging (see e.g. [18]): the predic¬ 
tion Z{x) is the conditional expectation of Z{x) given the 
measurements. It depends linearly on these measurements (see 
[18, Eq. (3.2.12)]) and involves a computational cost 0{N^), 
where N is the number of measurement points. Here again, 
the prediction necessitates the estimation of the parameters: 
different parameter estimation approaches were proposed (see 
e.g. [18], [19] for maximum likelihood, or [11], [18] for a 
Bayesian approach). This model was applied to REM inter¬ 
polation in [11], [19], [20] and this technique has proved to 
realize accurate prediction performances. 

All the models above assume that the antennas are omni¬ 
directional. Nevertheless, in macro-cellular networks, operators 
usually deploy directional antennas. Hence, the received power 
depends also on the direction of reception. To fit the model 
to this new constraint, several papers proposed to modify the 
model (2) by adding a term G{x) depending on the mobile 
location x and modeling the antenna gain (see e.g. [21], [22]): 
for X G K^, 

Z{x) = Pt — 10Klnio(dist(x)) -I- G{x) -\- v{x). (4) 

Different gain functions G are proposed, depending on the an¬ 
tenna used for the transmission (for example, a polar antenna, 
a sectorial antenna, ...); see e.g. [22]-[24]. The function G 
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depends on parameters which are usually considered known; 
we will allow the function G to depend on unknown param¬ 
eters to be calibrated from the observations. In this paper, we 
will extend the model (4) by considering a correlated spatial 
noise u{x). 


B. Fixed Rank Kriging prediction model 
For X G K^, Z{x) is assumed of the form 

Z{x) = pt — IOkIuio dist(a;) -I- ‘;G(x) -I- s(x)^r/, (5) 

where s : —)• K'' collects r deterministic spatial basis 

functions and r; is a R’'-valued zero mean Gaussian vector 
with covariance matrix K. denotes the transpose of the 
matrix A and by convention, the vectors are column-vectors. 
Pt — 10k Inio dist(a::) + ‘;G(x) describes the large scale spatial 
variation (i.e. the trend) and the random process {s{x)'^ri)x is 
a smooth small-scale spatial variation. In practice, the number 
of basis functions r and the basis functions s are chosen by 
the user (see [14, Section 4] and Section IV-Bl below). It is 
assumed that the function G is known: in the case of an omni¬ 
directional antenna, G is the null function, and for directional 
antenna we give an example in Section III. 

We have N measurement points t/i,-- - ,yN mod¬ 
eled as the realization of the observation vector Y = 
(Y(a;i),..., Y(xjv))^ at known locations Xi,--- ,Xjv and 
defined as follows 

Y {x) = Z (x) + a e (x), x G R^. (6) 

{s{x))x is assumed to be a zero mean standard Gaussian 
process, it incorporates the uncertainties of the measurement 
technique, rj and {e{x))x are assumed to be independent so 
that the covariance matrix of Y is given by 

S = + SKS'^, ( 7 ) 

where S = (s(xi),..., s{xn))'^ is the x r matrix, and /^r 
denotes the x Y identity matrix. This model implies that the 
conditional distribution of {Z{x))x given the observations Y 
is a Gaussian process. Its expectation and covariance functions 
are respectively given by (see e.g. [25, Appendix A.2]) 

x^t'^{x)a +s{xfKS'^S-\Y-Ta), ( 8 ) 

(x, x') s'^{x)Ks{x') - s{xfKS'^E-^SKs{x'), (9) 


inversion of the matrix S. By using standard matrix formulas 
(see e.g. [26, Section 1.5 , Eq. (18)]) we have 

=a-^lN-(r-^s{cr^K-^ + S'^s'^ ^ . ( 10 ) 

The key property of this FRK model is that it only requires the 
inversion of r x r matrices. Therefore, the computational cost 
for the REM prediction is 0(r^ N) which is a drastic reduction 
when compared to the classical Kriging in situations when N 
is large. The prediction formula also requires the knowledge 
of (a, cr^, K). The goal of the following section is to address 
the estimation of these parameters. 


C. Parameter estimation of the Fixed Rank Kriging model 

We first expose the method described in the original paper 
devoted to the FRK model [14]. We also provide a rigorous 
proof of some weaknesses of this estimation technique pointed 
out in [27] through numerical experiments. We then propose 
a second method which is more robust. 

1} Parameter estimation by a method of moments: In [14], 
a is estimated by the weighted least squares estimator: 
given an estimation {a^,K) of {a'^,K) which yields an 
estimation S of S (see Eq. (7)), we have Awls = 
{T^S T)~^T^S Y. Parameters cr^ and IT are estimated 
by a method of moments: the N observations are replaced with 
M “pseudo-observations” located at x'l^--- ,x'^^ in R^. For 
each i = 1, - ■ ■ , M, a pseudo-observation is constmcted as the 
average of the initial observations Y (x^), £ = 1, • ■ ■ , Y which 
are in a neighborhood of x'. The parameter M is chosen by the 
user such that r < M << Y. An empirical MxM covariance 
matrix Y'm is then associated to these pseudo-observations; it 
is easily invertible due to its reduced dimensions. Finally, the 
same ’’binning” technique is applied to the matrix S which 
yields a Mxr matrix Sm (see [14, Section 3.3.] for a detailed 
constmction of Sm and Sm', see also Appendix A below for 
a partial description). a'^,_K are then estimated by (see [14, 
Eq. (3.10)] applied with V = Im and S = Sm) 

2 ~ Z!m^ 

TV (/m - QQ^) 

K = R-^Q^{Sm - d^lM)Q{R~Y, 


( 11 ) 

( 12 ) 


1 —lOlnio dist(xi) G(xi) 


where T = 


1 —lOlnio dist(xjv) G{xn) 



'Pt 


1 

a = 

K 

, t{x) = 

— lOlnio dist(x) 


A. 


G(x) 


We use the mean value (8) as the estimator Z{x) for 
the unknown quantity Z{x). Note that the estimation of 
(Z(xi),..., Z(xjv))^ is not Y since at locations where we 
have measurements, the prediction technique (8) acts as a 
denoising algorithm. The prediction formula (8) involves the 


where Tr denotes the trace and Sm = QR is the orthogonal- 
triangular decomposition of Sm (Q is a M x r matrix which 
contains the first r columns of a unitary matrix and R is 
an invertible upper triangular matrix). These estimators are 
obtained by fitting Im + SmKS]^ to Sm, solving the 
optimization problem mm ^2 ^K\\SM-<rHM- SmKSUI 
where in this equation, || • || denotes the Froebenius norm 
(to have a better intuition of this strategy, compare this 
criterion to Eq. (7)). K has to be positive definite since it 
estimates an invertible covariance matrix. In [27], the authors 
observe through numerical examples that the estimator (12) is 
a singular covariance matrix (hence, they introduce an “eigen¬ 
value lifting” procedure to modify (12) and obtain a positive 
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definite matrix (see [27, Section 3.2.])). We identify sufficient 
conditions for this empirical observation to be always valid. 
More precisely, we establish in Appendix A the following. 

Proposition 1: Assume that Sm is a full rank matrix and 
let Sm = QR be its orthogonal-triangular decomposition {Q 
is a M X r matrix which collects the first r colurnns of a 
unitary matrix). Denote by {\j)j the eigenvalues of Sm and 
Vj the eigenspace of Aj. Then 

(i) Sm is positive semi-definite. 

(ii) given by ( 11 ) is lower bounded by 

(iii) K given by (12) is positive definite iff cP G 
[ 0 , Aniin(Q^-^M< 3 )) where Aniin(^) denotes the mini¬ 
mal eigenvalue of A. 

We also give in Appendix A a sufficient condition which 
implies that the minimal eigenvalue (say Ai) of Sm is positive. 
If there exists v & Vi such that ||Q^n|| = ||n|| then Q'^v 
is an eigenvector of Q^SmQ associated to the eigenvalue 
Ai (observe indeed that if ||Q^n|| = ||n||, then there exists 
p such that v = Qp and this vector satisfies p = Q^v). 
Therefore, if Ai > 0 and for any v G Vi, ||Q^n|| = ||n|| then 
Proposition 1 implies that K given by (12) can not be positive 
definite. 

2) Parameter estimation by Maximum Likelihood: We pro¬ 
pose to estimate the parameters by the Maximum Likelihood 
Estimator (MLE), following an idea close to that of [28], [29]. 
Observe from (5) and ( 6 ) that Y = To. + Sr] -b ae with 
e = (e(a;i), • • • ,e{xN))'^- This equation shows that from Y, 
it is not possible to estimate a general covariance matrix K 
since roughly speaking, Y is obtained from a single realization 
of a Gaussian vector rj with covariance matrix K. Therefore, 
we introduce a parametric model for this covariance matrix, 
depending on some vector v of low dimension: we will write 
K{ v). We give an example of such a parametric family in 
Section IV-B2; see also [25, Chapter 4]. 

Since rj and {e{x))x are independent processes, Y is a 
R^-valued Gaussian vector with mean Ta and with covari¬ 
ance matrix S = + SK{v)S'^. Therefore the log- 

likelihood L^{6) of the observations Y given the parameters 
0 = {a,cr‘^,v) is, up to an additive constant, 

Ly( 6») = lndet(o-27jv + SK{v)S'^) 

+ ••• 

x(Y-Ta), (13) 

where we used (10) for the expression of S~^. Maximizing 
directly the log-likelihood function 6 hG L^{9) is not straight¬ 
forward and cannot be computed analytically. We therefore 
propose a numerical solution based on the Expectation Maxi¬ 
mization (EM) algorithm [30]. EM allows the computation of 
the MLE in latent data models; in our framework, the latent 
variable is rj. It is an iterative algorithm which produces a 
sequence {0[i))i>o satisfying L^iO^+i)) > Ly{9{i)). This 
property is fundamental for the proof of convergence of any 
EM sequence [31]. Each iteration of EM consists in two steps: 


an Expectation step (E-step) and a Maximization step (M- 
step). Given the current value of the parameter, the E- 
step consists in the computation of the expectation of the log- 
likelihood of {Y,t]) under the conditional distribution of r] 
given Y for the current value of the parameter 0(;): 

Q(0;e(q)=E[lnPr(Y,77;e)|Y;0(,)], 

where 9 i-3- Pr(Y, rj; 9) is the likelihood of (Y, rf). In the M- 
step, the parameter is updated as the value maximizing 9 hG 
Q{9;9(^i'f) or as any value 0 (z+i) satisfying 

Q{9{i+iy,9^i-)) ^ Q{9(^iy,9(^i-^) . (14) 


The E- and M-steps are repeated until convergence, which in 
practice may mean when the difference between || 

changes by an arbitrarily small amount determined by the user 
(see e.g. [30, Chapter 3]). In our framework, we have 


Q{9-, 9) = ^ ln(det(K(v))) - 3 ^ l|Y - 


S'^S 


+ K-\v) E rjrj'^IY-,9 


1 

- 2 ^ 


-b -^(Y-TafSE |r?|Y; 6 »| , 


Taf 

(15) 


where (see e.g. [12, Appendix C]) 

E [r?|Y; ej = (^S'^S + ^ S^ (Y - Ta), 

cov[r,|Y;0] = . 


The update formulas of the parameters {a,a‘^) are given by 
(see e.g. [12, Appendix B] for the proof) 

«(;+!) = (r^T)“'r^(Y-SE[r,|Y;0(z)]) , 

a^z+i) = [||Y - - Sryf |Y;0(q] . 

With this choice, we have (5 (q:(z+i), v; 0(z)) > 

Q{9(j_)\9(ji)), for any v. The update of v is specific to each 
parametric model for K. Upon noting that the first order 
derivative of f - ,Vp) >--> Q(q:, cr^, t>; 0 p)) w.r.t. Vk 

is given by 




dK{v) \ 

dvk J 




K-^(v)E [T,r]^fY;9(i)]K-^(v) 


dK(v) \ 
dvk ) ’ 


(16) 


can be defined as the unique root of this gradient 
whenever it is the global maximum. Another strategy is to 
perform one iteration of a Newton-Raphson algorithm starting 
from with a step size chosen in order to satisfy the EM 
condition (14). See e.g. [30, Section 4.14] for EM combined 
with Newton-Raphson procedures. In Section IV-B2, we will 
give an example of structured covariance matrix and will derive 
the Newton-Raphson strategy to update one of the parameters. 
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III. REM EXTENDED TO MULTICELLULAR NETWORK 


We now consider a multicellular LTE network. In real 
network, UEs measure the received power of several BSs in 
order to choose the best serving one: the UE, this procedure 
is called the cell selection. In LTE, cell selection is applied by 
comparing the instant measured RSRP from all potential cells 
and choosing the cell providing the highest RSRP value [32]. 
In this section, we adapt the ERK model and the REM 
prediction technique described in Section II-B in order to 
address this multicellular use-case. 

We assume that the reported measurements correspond to 
the RSRP of the best serving cell: each measurement consists 
in the RSRP measure, the location information and the corre¬ 
sponding cell identifier (CID). The received power Zi{x) from 
the i-th BS at location x is given by Zi{x) = 0 is x ^ Vi and 
if a; G Vi, 


Zi{x) = pt,i-lOKi\nio{disti{x)) + <;iGi{x) + Si{x)'^'ni (17) 

where Vi C R^, pf i is the transmitted power of the *-th 
BS, Ki is the path loss exponent corresponding to the i-th 
BS and disti(x) is the distance from x to the i-th BS. We 
can choose Vi ^ R^ to model geographic area which are not 
covered by the i-th BS. rji is a Gaussian variable with zero 
mean and covariance matrix Ki. Si{x) : R^ —>■ R’’' collects 
Ci deterministic spatial basis functions. 

<;iGi{x) is the antenna gain which depends on the mobile 
location x. In our use-case, the antennas used for each BS are 
tri-sectored; we use a typical antenna pattern proposed in the 
3 GPP standard [1] with a horizontal gain only since we are 
using a 2-dimensional model: 


Gi (x) = — min 


12 

V WZdB J 


(18) 


where ipx,i is the angle between the UE location x, and the 
i-th BS antenna azimuth. ipsdB denotes the angle at which the 
antenna efficiency is 50% and Am is the maximum antenna 
gain. Eor a tri-sectorial antenna, the parameter ipsdB is usually 
taken equal to 65° and Am = 30dB. 

We have Ni observations Yi{x) having the z-th BS as 
the best serving cell. They are located at aJi*,-- - 
and are noisy measurements of Zi{x): Yi{x) = Zi{x) + 
aiSi^x) where {ei{x))x is a zero mean standard Gaussian 
process, independent of rji. Eollowing the same lines as in 
section II-B, we define the x 1 column vector = 
{Yi{xid), ■■■ , Yi{xNi,i)Y, and have = TiOLi+Sirji+aiEi 
where 

1 -101nio(disti(a::i,i)) Gi{xid) 

I ; ; , 

_1 -101nio(disti(x]Vi,i)) Gi{xNi,i)_ 


'Pt,i 



Ki 

5 — 


. . 




The parameters Pt,i,Ki,cri,<;i and Ki are unknown and are 


estimated from Y^ by applying the EM technique described 
in Section II-C (see also Section IV-B for the implementation). 

Eor any x such that x G Vi, set Zi{x) = E [Zi{x)\Yi], the 
expression of which can easily be adapted from (8). In the 
multicellular case, the inter-site shadowing correlation can be 
explained by a partial overlap of the large-scale propagation 
medium as explained in [33]. Hence, for any x such that 
X G Vi, we write Zi{x) = Z[{x) -\- W{x), where W{x) is the 
random cross-correlated shadowing term which depends only 
on the mobile location (also called overlapping propagation 
term) and Z'i[x) is the random correlated shadowing related 
to the z-th BS at the location x (also called non-overlapping 
propagation term). As explained in [33], the r.v. {Zl{x))i are 
independent, which implies that the probability that a UE 
located at x is attached to the z-th BS (which is denoted by 
CID (a:) = z) is given by 


P(CID(a:) 


z) =E 






(19) 


A simple approximation consists in approximating this expec¬ 
tation by 

n '^Zi(x)<Zi{xy 

j^yi-.x^Vj 


This yields the estimation rules for the CID and the RSRP 
value at x 


CID(a::) = argmaXj.^gi,,Zj(2:), 

= ^cro(.)(2;) = 


Zj{x). 


IV. Applications to cellular coverage map 
A. Data sets description 

Eor the single cell use-case, we consider a simulated data 
set and a real data set. The first data set consists of simulated 
measurement points generated with a very accurate planning 
tool, which uses a sophisticated ray-tracing propagation model 
developed for operational network planning [34]. This data is 
considered as the ground-tmth of the coverage in the area 
of interest. The collected data set corresponds to the LTE 
RSRP values in an urban scenario located in the Southwest 
of Paris (Prance). The environment is covered by a macro¬ 
cell with an omni-directional antenna. These measurement 
points are located on a 1000 mxlOOO m surface, regularly 
spaced on a cartesian grid consisting of 5 m x5 m squares; 
this yields a total of 40401 measurement points (see Pig. la, 
where the antenna location is (595416 m, 2 425 341m)). In 
order to model the noise measurements, a zero mean Gaussian 
noise with variance equal to 3 dB is added to the simulated 
measurements. This yields what we called in Section II the 
process {Y{x),x G V}, where V C R^. 

The second data set corresponds to real measurement points 
reported from Drive Tests (DT) done by Orange Prance teams, 
in a rural area located in southwestern Prance. The BS is 
about 30 m height and covers an area of 22 kmxlO km. 
7800 measurement points have been collected in the 800 
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MHz frequency band using a typical user’s mobile device 
connected to a software tool for data acquisition.The locations 
of the measurement points are shown on Fig. lb - note 
that they are along the roads and the antenna is located at 
(408 238 m, 1864600 m). For the multicellular use-case, we 
consider a simulated data set provided by the aforementioned 
Orange planning tool. This planning tool calculates RSRP 
values in a sub-urban environment shown in Fig. 2a, consisting 
of 12 antennas grouped into 4 sites of 3 directional antennas. 
The inter-site distance is bigger than 1 km. The antennas are 
tri-sectored. The RSRP values are computed over a regular 
grid of size 25 mx25 m over a 12.4 km^ geographic area, 
which results in a total of 20 008 locations; and it is realized 
over a 2.6 GHz frequency band. The planning tool returns, at 
each location of the regular grid, both the RSRP value and the 
ID of the best serving cell. Fig. 2b displays the RSRP values 
and Fig. 2c shows the best serving cell map where each color 
corresponds to a cell coverage area. 


B. EM implementation 

1) Choice of the basis functions s: The basis functions 
X i-j- s{x) = {Si{x),..., Sr{x)) and their number r both 
control the complexity and the accuracy of the FRK prediction 
technique. Following the suggestions in [14], we choose the l- 
th basis function a: H- S'; (x) as a symmetric function centered 
at locations xj: Si is a bi-square function defined as 


= |[l-(Ik-a^dl/r)^] , if ||x-x(|| < r , 

[O, otherwise . 

( 20 ) 

The parameter r controls the support of the function. In the 
numerical applications below, the centers of the basis functions 
x'l and their number r are chosen as follows: Xmax functions 
are located on a Cartesian grid where the elements are t x t 
squares covering the whole geographic area of interest. Then, 
for each function Si, if none of the N locations xi, ■ ■ • , xjv is 
in a T-neighborhood of the center x(, this function is removed. 
The number of the remaining basis function is r. On Fig. 3a 
and Fig. 3b, we show the locations of the N observations (red 
circle) and the locations of the r basis function centers (blue 
crosses) for two different data sets. In Fig. 3a, t = 100 m 
and r = Xmax (and N = 2000) while in Fig. 3b, r = 250 m, 
’’max = 2660 and x = 467. 

2 ) Structured covariance matrix K: Several examples of 
structured covariance matrix K can be chosen. In the radio 
cellular context, the shadowing term can be modeled as a 
zero-mean Gaussian random variable with an exponential 
correlation model [35]. Thus, K is given by 


K{PA) 


K{f) 

P 


with Kij{(j)) = exp 


\X, — Xa 


exp(((>) 


( 21 ) 

( 22 ) 


where ||x' — x' || is the Euclidean distance between the two 
locations x' and x' (related to the basis functions, see Sec¬ 
tion IV-Bl). \/P and exp(())) are respectively the variance of 


Vi^ ^ S I S r; and a rate of decay of the correlation (the 
choice of the parametrization ex-p{4>) avoids the introduction 
of a constraint of sign when estimating </>). We therefore have 

V = {P, p) G K* X R. For this specific parametric matrix (21- 

22), a possible update of the parameters (/), p) which ensures 
the monotonicity property of the EM algorithm is (see e.g. [12, 
Appendix B]): /?(;+!) = x/Tr V(i)^ and 

P(i+t)=Pii)-^^--- 

X Tr - Ir) A o 

where K^i) is a shorthand notation for K{p(iP), A is the x x x 
matrix with entries (||x' — x'j\\)ij, Vp) is a shorthand notation 
for E [7777^[Y; d{i)], o denotes the Hadamard product and 

Tfp) = -TV A O k (/3(i+i)XiV(q - k)) 

+ exp(-())(/))Tl- (^ki ^ AoAo ki - Ir'j'j 

+exp(-())(/))Ti- (^(^ki ^ A o ki^ (^Ir - 2/3(,+i)K;V(i)^^ ; 

a(q e (0,1) is chosen so that (5(0(;+i); 0(q) > 

Q{0(i)\0(^i)). 

3 ) EM convergence: EM converges whatever the initial 
value 0(0) (see [31]); the limiting points of the EM sequences 
are the stationary points of the log-likelihood of the observa¬ 
tions Y. We did not observe that the initialization 0(o) plays 
a role on the limiting value of our EM mns. A natural initial 
value for a is the Ordinary Least Square estimator given by 

q:(o) = {t^T^ T^Y. We choose 0(0) large enough so that 

the matrix k{p(^op looks like the identity matrix; in practice, 
we choose t/ exp(0) in the order of 5. Einally, we compute the 
empirical variance V of the components of the residual vector 

Y — T'a(o) and choose -|- = V; roughly speaking, 

we start from a model with uncorrelated shadowing term. The 
algorithm is stopped when ||0(;) — 0(i_i)|| < 10“^ over 100 
successive iterations. We report in Table I the values of the 
parameters at convergence of EM for the simulated data set. 

TABLE 1. Simulated data set, when r = 50 m, r = 400 and 
JV = 32000 



a 

1/^ 

P 

18.15 

-49. 5512.73 

12.5 

3.63 


C. Prediction Error Analysis for the single cell use-case 

Each data set is splitted into a learning set and a test set. 
Using the data in the learning set, the parameters are estimated 
by the method described in Section II-C. The performances 
are then evaluated using the data in the test set. In order to 
make this analysis more robust to the choice of the learning 
and test sets, we perform a k-fold cross validation [36] (here, 
we choose k = 5) with a uniform data sampling of the 









7 



595000 595400 595800 

(m) 

(a) Simulated data set. 

Fig. 1. One cell case: the measurements (y(a;))a;. 



—^^- f — 

400000 410000 

(m) 


(b) Rural data set. 



(a) (b) (c) 

Fig. 2. Multicellular case: (a) BS locations; (b) the simulated RSRP map; (c) measurements grouped in 12 clusters, according to their best serving cell ID 




2.4259 

2.4257 

2.4255 

2.4253 

2.4251 

2.4249 


(a) Simulated data set 


(b) Real data set 




Fig. 3. Locations of the N observations (red circles) and locations of the r 
centers (blue crosses) of the basis functions. 


subsets (typical values for k are in the range 3 to 10 [25, see 
Section 5.3.]). Therefore, at each step of this cross-validation 
procedure, we have a learning set consisting of 80% of the 
available measurement points (making a learning sets with 
resp. 32000 and 6000 points for resp. the simulated data set 


and the real data set). 

In order to evaluate the prediction accuracy, we compare 
the measurements Y{x) to the predicted values Y{x) from 
the model (6). We consider the locations x in the test set 
T. The model (6) implies that the conditional expectation of 
Y (x) given Y at such locations x is equal to the conditional 
expectation of Z{x) given Y since e{x) is independent of 
Y. Therefore, for any x G T, the error (with sign) is 
Y{x) — Y{x) = Z{x) — Y{x) where Z{x) is given by (8). 
We evaluate the Root Mean Square Error (RMSE) which is 
a commonly used prediction error indicator (see e.g. [37]), 
defined as 


RMSE = 


' ' x€T 


(23) 


where \T\ denotes the number of observations in the test set 
T. The RMSE is computed for each of the k successive test 
sets in the cross-validation analysis. In Tables II and III, we 
report the mean value of the RMSE over the k partitions and 
its standard deviation in parenthesis. We compare different 
strategies for modeling the observations {Y{x))x, for the 
























parameter estimation of the model and for the prediction: 

• Log-Normal: the log-normal shadowing model (see 

(2)) when the parameters are estimated by 

MLE. Z{x) is given by pt — 10k log]^g(dist(2:)); this 
method does not depend on r. 

• FRK: the FRK model (see section II-B) when the param¬ 
eters are estimated by MLE (see Sections II-C and IV-B) 
and Z [x) is given by (8), for different values of r. 

In tables II and III, we report the mean RMSE over the k splits 
of the data set and its standard deviation between parenthesis. 

These tables show that the FRK model improves on the log- 

TABLE II. Simulated data set: Mean RMSE and Standard 

DEVIATION IN PARENTHESIS. 


Log-Normal 

FRK FRK 

r = 1089 r = 100 

5.08 

(6.08e-02) 

3.98 4.67 

(5.18e-02) (4.46e-02) 


TABLE III. Real data set: Mean RMSE and Standard 
DEVIATION IN PARENTHESIS 


Log-Normal 

FRK FRK 

r = 1000 r = 150 

8.95 

(1.46e-01) 

3.51 5.57 

(1.24e-01) (6.23e-02) 


normal model. For the real data set, it yields a considerably 
low RMSE (in the order of 3 — 5 dB) when compared to the 
log-normal shadowing model which has a RMSE in the order 
of 9 dB. For the simulated data set, we have a similar behavior. 

The computational complexity of the FRK approach is 
essentially related to r, the number of basis functions. On 
the one hand, the computational cost increases with r and 
on the other hand, the prediction accuracy increases with 
r. We report on Fig. 4 the running time and the predic¬ 
tion accuracy measured in terms of mean RMSE over the 
k splitting of the data set into a learning and a test set, 
as a function of r; by convention, the mnning time is set 
to 1 when r = 64. The plot is obtained with 7 different 
analysis, obtained with t G {30,40, 50,60,80,100,120} - or 
equivalently, r G {1089,625,400,289,169,100,64}. It shows 
that the mnning time is multiplied by a factor 130 and the 
prediction accuracy is increased by 20% when moving from 
r = 120 (r = 64) to r = 30 (r = 1089). 

D. Prediction Error Analysis for the multicellular use-case 

The data set is splitted into a learning set with 16 000 
points and a test set. Based on their best serving cell ID, 
these 16 000 points are clustered into 12 subsets. The size 
of these subsets varies between 1000 and 3500. In Fig. 5a 
a learning subset associated to a given BS is displayed: note 
that the observations with a given best serving cell ID are not 
uniformly distributed over the geographical area of interest. 
We choose the same initial basis functions for the 12 sub¬ 
models (defined by Eq.(20) with t = 150, which yields 



Fig. 4. Simulated Data set: for different values of r, the running time and 
the mean RMSE 


Tmax = 588). For each sub-model, some of the basis functions 
are canceled as described in Section IV-B I (see the blue circles 
and black dots in Fig. 5a). Fig. 5b shows the path-loss function 
X I—I- pt^i — lOKi Inio disti(a:) -|-CiGi(a:): note that, as expected, 
TiCt.i is bigger in the direction of the antenna spread. In 
Fig. 5c, we display {Zi{x),x G Vi}. Vi is defined as the 
area covering the main direction of the i-th antenna radiation. 

The best serving cell ID (CIDbs) for any location a: G Di is 
defined as the ID of the BS having the biggest probability that 
the ME is attached to it at location x as detailed in Eq. 19. 
Then the predicted received power at location x corresponds 
to the predicted received power of the best serving cell at 
that location. For performance evaluation, we first consider 
an omni-directional antenna model (similar to the one in 
section IV-C). We compare the predicted cell ID for each 
location x (that is the index j such that Z{x) = Zj[x)) 
to the real one. We obtain an error rate of 53% over the 
locations x in the test set. When we consider the domain 
clustering introduced in (17) (the antennas are still assumed 
to be omnidirectional), the error rate on cell ID selection is 
31.23% over the test set locations. Finally, we consider the 
directional model together with the same domain restriction 
Vi. The error rate is drastically decreased to 12.64%. This error 
rate is expected to further decrease when using real antenna 
patterns (the impact of approximating real antenna patterns 
with the 3GPP model is studied for example in [38]). 

V. Conclusion 

In this work we have studied the performance of the 
FRK algorithm applied to coverage analysis in cellular net¬ 
works. This method has a good potential when performing 
prediction using massive data sets (order of thousands and 
higher) as it offers a good trade-off between prediction quality 
and computational complexity compared to classical Kriging 
techniques. This study has been performed using field-like 
measurements obtained from an accurate planning tool and 
real field measurements obtained from drive tests. In addition 
we have adapted the model to a more practical application: we 
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(a) For a given BS: the associated obser- (b) x i —TiOLi 

vations Yj {red crosses) and the locations 
of the Vi basis functions (black dots). 


Fig. 5. Multicellular case: results for a given best serving cell ID i 



- -40 

- -60 
- -80 

- -100 
- -120 
- -140 


(m) 


(c) estimated field (Zi{x))x, for x E T)i 
(the black points correspond to locations 
X not in T>i). 


used field-like measurements over several cells with directive 
antennas. Simulation results show a good performance in terms 
of coverage prediction and detection of the best serving cell. 
In future works, we target to further improve this performance 
by using real antenna patterns. Finally, our ongoing research 
focuses on extending the model to take into account the loca¬ 
tion uncertainty and on studying its impact on the prediction 
performances. 

Appendix A 

Proof of Proposition 1 

We recall some notations introduced in [14, Appendix A], 
which will be useful for the proof of Proposition 1. For 
j = !,■■■ ,M, set Wj = {Wji,... ,WjN)'^, where Wij 
is the weight associated to the observation Y{xj) in the 
neighborhood of the bin center xj (see [14] for the expression 
of these non negative weights). Define the vector of residual 
D = [Di,--- ,DAr)^ = Y-T(T^)-iT^Y, an_d associate 
an aggregated vector of residuals D = {Di, ■ ■ ■ , Dm)"'" and 
a weighted square residuals 

p T^=iWuDi WjD Zl.We.Dl 

1 AT is the Y X 1 vector of ones. The M x M matrix X'm is 
defined by (see [14, Eq. (A.2)]) 

X'jvf (Z, k) = DiDk, for I k, SM(k, k) = Vk- (24) 

Proof of Proposition 1 (i) Let p, = {pi, ■ ■ ■ , pm) G . From 
(24), 

( M \^M 

'^pidA + p^ (vi ~ 

1=1 / 1=1 

M 

1=1 


_2 

The Jensen’s inequality implies that Vi > Di for any I 
thus showing that p^ SmP > 0. Note also that this term 
is positive for any non null vector p iff Vi — Di >0 for 
any 1. (ii) Since Sm is a covariance matrix, there exists an 
orthogonal M x M matrix U and a diag^onal M x M matrix 
A with diagonal entries (Ai); such that Sm = UAU"". Since 
Tr(Ai3) = Tr(i?A), we have 

M 

TA ((7m - QQ'^)UAU^) = ^ 

i=l 

where B = U''"{Im — QQ^)U. Assume that Bn > 0 for 
any i. Then 

Trf(7M> ( inf A, ) Tr(B). 

^ 2 \j-.Bn>0 J 

Since Tr(B) = Tr((7M - QQ^)UU'^) = Tt{Im - QQ^), 
we have > (infj:Bjj>o A^). Let us prove that Bn > 0 
for any i: for p G K^, p'^Bp = ||7/^||^ — \\Q''"{Up)\\^ 
and this term is non negative since Q^{Up) is the orthogonal 
projection of (Up) on the column space of Q (or equivalently, 
of Sm)- This equality also shows that 

{j:B,,>0} = {j-.3veyj,\\vf>\\Q^vf} 

= {j-.3veVj,\\iSii)^v\\>0}. 

(hi) Since Sm is a full rank matrix, R is invertible. There¬ 
fore, from (12), it is trivial that K is positive definite iff 
Q""{Bm — d'^lM)Q is positive definite. Since = Ir, 

we have for any p e K'', p 0: p^ [O'"BmQ — cT^Ir)p > 0 
iff p^iQ'^BMQ)p>a^ \\p\\^. 


Remark.: It can be seen from the proof of (i) that Bm 
is positive definite iff for any I, Wi has at least two non null 
components (say such that Dj^. 
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